Search results for "Computer Science - Databases"

showing 9 items of 9 documents

Distributed Real-Time Sentiment Analysis for Big Data Social Streams

2014

The big data trend has forced data-centric systems to handle continuous, fast data streams. In recent years, real-time analytics on stream data has developed into a new research field, which aims to answer queries about "what-is-happening-now" with negligible delay. The real challenge with real-time stream data processing is that it is impossible to store instances of data, and therefore online analytical algorithms are utilized. To perform real-time analytics, pre-processing of data should be performed in such a way that only a short summary of the stream is stored in main memory. In addition, due to the high speed of arrival, the average processing time for each instance of data should be in such a way that…

Data stream, FOS: Computer and information sciences, Computer Science - Computation and Language, Computer science, business.industry, Data stream mining, Sentiment analysis, Big data, Machine Learning (stat.ML), Databases (cs.DB), Data structure, computer.software_genre, Field (computer science), Computer Science - Information Retrieval, Tree (data structure), Computer Science - Databases, Computer Science - Distributed Parallel and Cluster Computing, Analytics, Statistics - Machine Learning, Data mining, Distributed Parallel and Cluster Computing (cs.DC), business, computer, Computation and Language (cs.CL), Information Retrieval (cs.IR)

researchProduct

Parallel In-Memory Evaluation of Spatial Joins

2019

The spatial join is a popular operation in spatial database systems and its evaluation is a well-studied problem. As main memories become bigger and faster and commodity hardware supports parallel processing, there is a need to revamp classic join algorithms which have been designed for I/O-bound processing. In view of this, we study the in-memory and parallel evaluation of spatial joins, by re-designing a classic partitioning-based algorithm to consider alternative approaches for space partitioning. Our study shows that, compared to a straightforward implementation of the algorithm, our tuning can improve performance significantly. We also show how to select appropriate partitioning parame…

FOS: Computer and information sciences, Computer Science - Databases, Computer Science - Distributed Parallel and Cluster Computing, Parallel processing (DSP implementation), Computer science, Order (business), Joins, Join (sigma algebra), Databases (cs.DB), Parallel computing, Distributed Parallel and Cluster Computing (cs.DC), Computer Science::Databases

researchProduct

A Two-level Spatial In-Memory Index

2020

Very large volumes of spatial data are increasingly becoming available and demand effective management. While there have been decades of research on spatial data management, few works consider the current state of commodity hardware, which has relatively large memory and the ability to perform parallel multi-core processing. In this work, we re-consider the design of spatial indexing under this new reality. Specifically, we propose a main-memory indexing approach for objects with spatial extent, which is based on a classic regular space partitioning into disjoint tiles. The novelty of our index is that the contents of each tile are further partitioned into four classes. This second-level partitioning not onl…

FOS: Computer and information sciences, Computer Science - Databases, Databases (cs.DB)

researchProduct

RTIndeX: Exploiting Hardware-Accelerated GPU Raytracing for Database Indexing

2023

Data management on GPUs has become increasingly relevant due to a tremendous rise in processing power and available GPU memory. Just like in the CPU world, there is a need for performant GPU-resident index structures to speed up query processing. Unfortunately, mapping indexes efficiently to the highly parallel and hard-to-program hardware is challenging and often fails to yield the desired performance and flexibility. Therefore, we advocate taking a different route. Instead of proposing yet another hand-tailored index, we investigate whether we can exploit an indexing mechanism that is already built into modern GPUs: the raytracing hardware accelerator provided by NVIDIA RTX cards. To do …

FOS: Computer and information sciences, Computer Science - Graphics, Computer Science - Databases, Databases (cs.DB), Graphics (cs.GR)

researchProduct

Finding k-dissimilar paths with minimum collective length

2018

Shortest path computation is a fundamental problem in road networks. However, in many real-world scenarios, determining solely the shortest path is not enough. In this paper, we study the problem of finding k-Dissimilar Paths with Minimum Collective Length (kDPwML), which aims at computing a set of paths from a source s to a target t such that all paths are pairwise dissimilar by at least θ and the sum of the path lengths is minimal. We introduce an exact algorithm for the kDPwML problem, which iterates over all possible s-t paths while employing two pruning techniques to reduce the prohibitively expensive computational cost. To achieve scalability, we also define the much smaller set …

FOS: Computer and information sciences, Computer science, Databases (cs.DB), 0102 computer and information sciences, 02 engineering and technology, 01 natural sciences, Set (abstract data type), Exact algorithm, Computer Science - Databases, 010201 computation theory & mathematics, Iterated function, 020204 information systems, Computer Science - Data Structures and Algorithms, Shortest path problem, Scalability, Path (graph theory), 0202 electrical engineering electronic engineering information engineering, Data Structures and Algorithms (cs.DS), Pairwise comparison, Pruning (decision trees), Algorithm, Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems

researchProduct

Open Data Quality Evaluation: A Comparative Analysis of Open Data in Latvia

2020

Nowadays open data is entering the mainstream: it is freely available to every stakeholder and is often used in business decision-making. It is important to be sure that the data is trustworthy and error-free, as quality problems can lead to huge losses. The research discusses how (open) data quality could be assessed. It also covers the main points that should be considered when developing a data quality management solution. One specific approach is applied to several Latvian open data sets. The research provides a step-by-step guide to analysing open data sets and summarizes its results. It is also shown that there could be differences in data quality depending on the data supplier (centralized and decentralized d…

FOS: Computer and information sciences, General Computer Science, Computer science, media_common.quotation_subject, Stakeholder, Latvian, Databases (cs.DB), Statistics - Applications, Statistics - Computation, language.human_language, Computer Science - Information Retrieval, Computer Science - Computers and Society, Open data, Lead (geology), Computer Science - Databases, Risk analysis (engineering), Data quality, Computers and Society (cs.CY), language, Mainstream, Quality (business), Applications (stat.AP), Information Retrieval (cs.IR), Computation (stat.CO), media_common

researchProduct

Application of LEAN Principles to Improve Business Processes: a Case Study in a Latvian IT Company

2020

The research deals with the application of LEAN principles to the business processes of a typical IT company. The paper discusses LEAN principles, highlighting the advantages and shortcomings of their application. The authors suggest using LEAN principles as a tool to identify improvement potential in an IT company's business processes and workflow efficiency. In a case study, the implementation of LEAN principles is exemplified in the business processes of a particular Latvian IT company. The obtained results and conclusions can be used for meaningful and successful application of LEAN principles and methods in projects of other IT companies.

FOS: Computer and information sciences, Process management, General Computer Science, Business process, Latvian, Databases (cs.DB), language.human_language, Software Engineering (cs.SE), Computer Science - Software Engineering, Computer Science - Computers and Society, Computer Science - Databases, Computers and Society (cs.CY), language, Business, Baltic Journal of Modern Computing

researchProduct

Standard Vs Uniform Binary Search and Their Variants in Learned Static Indexing: The Case of the Searching on Sorted Data Benchmarking Software Platf…

2023

Learned Indexes are a novel approach to searching in a sorted table. A model is used to predict an interval in which to search, and a Binary Search routine is used to finalize the search. They are quite effective. For the final stage, the lower_bound routine of the Standard C++ library is usually used, although this is a natural choice rather than a requirement. However, recent studies that do not use Machine Learning predictions indicate that other implementations of Binary Search or its variants, namely k-ary Search, are better suited to take advantage of the features offered by modern computer architectures. With the use of the Searching on Sorted Sets SOSD Learned Indexing bench…

I.2, FOS: Computer and information sciences, Computer Science - Machine Learning, learned index structures, H.2, Databases (cs.DB), search on sorted data platform, Computer Science - Information Retrieval, Machine Learning (cs.LG), E.1; I.2; H.2, Computer Science - Databases, binary search variants, Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), E.1, algorithms with prediction, Software, Information Retrieval (cs.IR)

researchProduct
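
The two-stage lookup that the abstract above describes (a model predicts an interval, then a binary search finishes the job) can be illustrated with a small sketch. This is not the paper's code or the SOSD platform: it assumes a single least-squares linear model over a non-empty sorted Python list, and the names `build_linear_model` and `learned_lower_bound` are hypothetical. Python's `bisect_left` stands in for the C++ `lower_bound` routine.

```python
from bisect import bisect_left
import math

def build_linear_model(keys):
    """Fit position ~ slope * key + intercept on a non-empty sorted list.

    Returns (slope, intercept, eps), where eps is an integer upper bound on
    the prediction error over the stored keys. Illustrative model only.
    """
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var_k = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var_k if var_k else 0.0
    intercept = mean_p - slope * mean_k
    worst = max(abs(i - (slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, int(worst) + 1

def learned_lower_bound(keys, model, q):
    """Return the first index i with keys[i] >= q (len(keys) if none)."""
    slope, intercept, eps = model
    n = len(keys)
    pred = math.floor(slope * q + intercept)
    # Stage 1: the model narrows the search to [lo, hi), clamped to the table.
    lo = min(n, max(0, pred - eps))
    hi = max(lo, min(n, pred + eps + 2))
    # Stage 2: a standard binary search finalizes the lookup in that interval.
    return bisect_left(keys, q, lo, hi)

keys = [i * i for i in range(200)]
model = build_linear_model(keys)
position = learned_lower_bound(keys, model, 1000)
```

The final stage is deliberately isolated in one call, so a different routine (for example a uniform or k-ary search) could be swapped in without touching the model.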

Identifying the k Best Targets for an Advertisement Campaign via Online Social Networks

2020

We propose a novel approach for the recommendation of possible customers (users) to advertisers (e.g., brands) based on two main aspects: (i) the comparison between Online Social Network profiles, and (ii) neighborhood analysis on the Online Social Network. Profile matching between users and brands is based on a bag-of-words representation of textual content coming from social media, and measures such as the Term Frequency-Inverse Document Frequency are used to characterize the importance of words in the comparison. The approach has been implemented relying on Big Data Technologies, thus allowing the efficient analysis of very large Online Social Networks. Resul…

Social and Information Networks (cs.SI), FOS: Computer and information sciences, Matching (statistics), Social network, Settore INF/01 - Informatica, business.industry, Computer science, Big data, Databases (cs.DB), Advertising, Computer Science - Social and Information Networks, Online Social Networks Social Advertising tf-idf Profile Matching., Term (time), Computer Science - Information Retrieval, Set (abstract data type), Computer Science - Databases, Order (business), Computer Science - Data Structures and Algorithms, Data Structures and Algorithms (cs.DS), Social media, business, Representation (mathematics), Information Retrieval (cs.IR)

researchProduct
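
The profile-matching idea in the abstract above (bag-of-words profiles weighted by TF-IDF) can be sketched as follows. This is only an illustration under assumptions, not the authors' implementation: it uses cosine similarity as the comparison measure, ignores the neighborhood-analysis step, runs on a single machine rather than on Big Data technologies, and all function names and sample profiles are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: dict name -> list of tokens. Returns dict name -> {term: weight}."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))  # document frequency: count each term once per profile
    vectors = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top_k_users(brand_tokens, user_tokens, k):
    """Rank users by TF-IDF cosine similarity to the brand profile."""
    docs = {"__brand__": brand_tokens, **user_tokens}  # assumes no user has this name
    vectors = tfidf_vectors(docs)
    brand_vec = vectors.pop("__brand__")
    scored = [(cosine(brand_vec, vec), name) for name, vec in vectors.items()]
    scored.sort(reverse=True)
    return scored[:k]

# Hypothetical toy profiles, already tokenized.
brand = "running shoes marathon sport".split()
users = {
    "alice": "marathon training running".split(),
    "bob": "cooking pasta recipes".split(),
    "carol": "sport shoes sale".split(),
}
best = top_k_users(brand, users, 2)
```

In this toy data `bob` shares no terms with the brand, so his similarity is zero and he falls outside the top two.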